Understanding shapes

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
from scipy.integrate import quad
from progressbar import progressbar as pbar
from rlxutils import subplots, copy_func
import pandas as pd
import seaborn as sns
import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions
tfb = tfp.bijectors

%matplotlib inline

Observe that, besides a batch_shape, distributions have an associated event_shape, which in this case is empty, signalling that each distribution in the batch is a distribution over scalars.

d = tfd.Normal(loc=[-1,0], scale=1)
d

<tfp.distributions.Normal 'Normal' batch_shape=[2] event_shape=[] dtype=float32>

In general, picture yourself wanting to assign a different distribution to each data point in your dataset.

Multivariate distributions#

Multivariate distributions have a non-empty event_shape. For instance, for a multivariate normal distribution we must specify a covariance matrix. Observe that this cannot be expressed through batch_shape alone, since in general the variables of a multivariate normal are not independent.

# Initialize a single 2-variate Gaussian.
mu = [1., 2]
cov = [[ 0.36,  0.12],
       [ 0.12,  0.29]]

mvn = tfd.MultivariateNormalTriL(
    loc=mu,
    scale_tril=tf.linalg.cholesky(cov))

s = mvn.sample(10000)
s.shape

TensorShape([10000, 2])

Observe that the dependence between the two variables shows up as a slanted density plot.

sns.displot( x = s[:,0], y = s[:,1], kind="kde", rug=False)
plt.axis("equal"); plt.grid();

../_images/ce6a6afa07f58b6ac918833f7b52b77588fdba9396981885f121755c7c9a892e.png
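To make the dependence concrete, the following NumPy-only sketch (an illustration that does not use TFP; the seed and sample size are arbitrary choices) draws samples through the Cholesky factor of the same covariance matrix and compares the empirical correlation with the one implied by `cov`, which is 0.12 / sqrt(0.36 * 0.29), roughly 0.37.

```python
import numpy as np

rng = np.random.default_rng(0)

mu = np.array([1.0, 2.0])
cov = np.array([[0.36, 0.12],
                [0.12, 0.29]])

# If z ~ N(0, I) and L is the Cholesky factor of cov, then mu + L z ~ N(mu, cov).
L = np.linalg.cholesky(cov)
z = rng.standard_normal(size=(100_000, 2))
samples = mu + z @ L.T

expected_corr = cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])
empirical_corr = np.corrcoef(samples[:, 0], samples[:, 1])[0, 1]

print(samples.shape)
print(round(expected_corr, 3), round(empirical_corr, 3))
```

A batch of two independent scalar Normals could reproduce the two marginals, but its correlation would be zero, so the plot would not be slanted.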

Observe that this implies a non-empty event_shape:

mvn

<tfp.distributions.MultivariateNormalTriL 'MultivariateNormalTriL' batch_shape=[] event_shape=[2] dtype=float32>

Combining `batch_shape` and `event_shape`#

Together, batch_shape and event_shape determine the shape of the samples. Observe how we broadcast the covariance matrix to obtain a batch of three multivariate distributions of two variables each. The three distributions share the same covariance matrix.

# Initialize a batch of three 2-variate Gaussians.
mu = [[1., 2],[3,4], [5,6]]

cov = [[ 0.36,  0.12],
       [ 0.12,  0.29]]

mvn = tfd.MultivariateNormalTriL(
    loc=mu,
    scale_tril=tf.linalg.cholesky(cov)
)

mvn

<tfp.distributions.MultivariateNormalTriL 'MultivariateNormalTriL' batch_shape=[3] event_shape=[2] dtype=float32>

s = mvn.sample(10000)
s.shape

TensorShape([10000, 3, 2])

np.min(mu)

1.0

for ax,i in subplots(s.shape[1], usizex=4):    
    sns.kdeplot( x = s[:, i, 0], y = s[:, i, 1], ax=ax)
    plt.axis("equal"); plt.grid();
    plt.xlim(np.min(mu)-2, np.max(mu)+2)
    plt.ylim(np.min(mu)-4, np.max(mu)+4)
plt.tight_layout()

../_images/fa1b95a2fd53199c74c40b5a8cacb8f371ff0a772b0191edae2a35d8f738cbcf.png
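The broadcasting can be mimicked in plain NumPy. This sketch is an analogy rather than TFP's actual implementation: one Cholesky factor is shared by three mean vectors, and the resulting array has the same [sample, batch, event] layout as `s` above. The seed is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(0)

mu = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])   # batch of 3 means, shape (3, 2)
cov = np.array([[0.36, 0.12],
                [0.12, 0.29]])                         # one covariance, shape (2, 2)
L = np.linalg.cholesky(cov)

n = 10_000
z = rng.standard_normal(size=(n, 3, 2))

# (n, 3, 2) @ (2, 2) applies the same factor to every batch member;
# adding mu broadcasts the (3, 2) means over the sample axis.
samples = mu + z @ L.T

print(samples.shape)                    # (10000, 3, 2)
print(samples.mean(axis=0).round(1))    # close to mu
```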

Making sense of shapes#

In general, samples from a distribution have the shape [sample_shape, batch_shape, event_shape], that is, the concatenation of the three shapes.

sample_shape is determined when you call the sample method of the distribution
batch_shape is determined by the parameters of the distribution when you create it
event_shape is determined by the nature of the multivariate distribution you use

This way:

samples: are independent and identically distributed
batch components: are independent and NOT identically distributed
event components: are NOT independent and NOT identically distributed

The following table might help make sense of this. It is based on this post and this post.

\[\begin{split} \begin{array}{|c|c|c|c|l|}\hline
\textbf{sample shape}&\textbf{batch shape}&\textbf{event shape} & \textbf{distribution for samples} & \textbf{description}\\\hline
\texttt{[2]}&\texttt{[]} &\texttt{[]} & X_0, X_1 \overset{iid}{\sim} \mathcal{N}(\mu, \sigma)&\text{one single variable distribution}\\\hline
\texttt{[2]}&\texttt{[3]} &\texttt{[]} & X_{0,0}, X_{1,0} \overset{iid}{\sim} \mathcal{N}(\mu_0, \sigma_0)&\text{three single variable distributions}\\
&&& X_{0,1}, X_{1,1} \overset{iid}{\sim} \mathcal{N}(\mu_1, \sigma_1)&\\
&&& X_{0,2}, X_{1,2} \overset{iid}{\sim} \mathcal{N}(\mu_2, \sigma_2)&\\\hline
\texttt{[2]}&\texttt{[]} &\texttt{[4]} & \mathbf{X}_0, \mathbf{X}_1 \overset{iid}{\sim} \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\sigma})\;\;\;\;\;\mathbf{X}_i\in\mathbb{R}^4&\text{one multivariate distribution}\\\hline
\texttt{[2]}&\texttt{[3]} &\texttt{[4]} & \mathbf{X}_{0,0}, \mathbf{X}_{1,0} \overset{iid}{\sim} \mathcal{N}(\boldsymbol{\mu}_0, \boldsymbol{\sigma}_0)\;\;\;\;\;\mathbf{X}_{i,j}\in\mathbb{R}^4&\text{three multivariate distributions}\\
&&& \mathbf{X}_{0,1}, \mathbf{X}_{1,1} \overset{iid}{\sim} \mathcal{N}(\boldsymbol{\mu}_1, \boldsymbol{\sigma}_1)\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;&\\
&&& \mathbf{X}_{0,2}, \mathbf{X}_{1,2} \overset{iid}{\sim} \mathcal{N}(\boldsymbol{\mu}_2, \boldsymbol{\sigma}_2)\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;&\\\hline
\end{array} \end{split}\]

Observe how we deal with each case in the code below.

# batch_shape=[], event_shape=[]

d = tfd.Normal(loc=0, scale=1)
print (d)
print (d.sample(2).numpy())

tfp.distributions.Normal("Normal", batch_shape=[], event_shape=[], dtype=float32)
[ 0.75803363 -1.8501624 ]

# batch_shape=[3], event_shape=[]     implicitly with vector parameters

d = tfd.Normal(loc=[0,1,2], scale=1) # scale=1 is broadcast over the batch of three
print (d)
print (d.sample(2).numpy())

tfp.distributions.Normal("Normal", batch_shape=[3], event_shape=[], dtype=float32)
[[ 1.1833676  1.7853057  5.493244 ]
 [-1.2918025  2.284341   0.9959991]]

# batch_shape=[], event_shape=[4]        explicitly with a multivariate distribution

mu = np.random.random(4)*10-5
A = np.random.random(size=(4,4))+.1
cov = A.dot(A.T)                       # symmetric positive definite (almost surely)
scale_tril = tf.linalg.cholesky(cov)   # lower-triangular Cholesky factor of cov

d = tfd.MultivariateNormalTriL(mu, scale_tril)

print (d)
print (d.sample(2).numpy())

tfp.distributions.MultivariateNormalTriL("MultivariateNormalTriL", batch_shape=[], event_shape=[4], dtype=float64)
[[-4.94196295  4.54800186 -0.1223484  -2.39742009]
 [-6.22349008  2.51441253 -2.61423195 -3.71563902]]

# batch_shape=[3], event_shape=[4]

mu = np.random.random((3,4))*10-5

A = np.random.random(size=(4,4))+.1
cov = A.dot(A.T)                       # symmetric positive definite (almost surely)
scale_tril = tf.linalg.cholesky(cov)   # single Cholesky factor, broadcast over the batch of 3

d = tfd.MultivariateNormalTriL(mu, scale_tril)

print (d)
print (d.sample(2).numpy())

tfp.distributions.MultivariateNormalTriL("MultivariateNormalTriL", batch_shape=[3], event_shape=[4], dtype=float64)
[[[-5.1101803   0.8596026   3.06403109  2.1083541 ]
  [ 3.69974774 -3.93829998 -1.6255217   2.22765852]
  [ 3.32589202  2.69049452 -4.32719606  1.2584832 ]]

 [[-5.93515374  0.91773734  2.35591075  2.1539073 ]
  [ 6.17373777 -3.15403426 -0.98656476  2.23261819]
  [ 1.86732188  2.18168707 -5.64504685  1.01497814]]]

The `Independent` distribution object#

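The Independent distribution allows us to transfer dimensions from batch_shape to event_shape: the rightmost reinterpreted_batch_ndims batch dimensions become event dimensions.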

This might be useful when designing specific final distributions. Recall that batches are sets of distributions of the same family but with different parameters, so they are independent, whereas the components of an event belong to the same multivariate distribution and may be dependent.

# Initialize a 3x4 batch of 2-variate Gaussians.
mu = np.random.random(size=(3,4,2)).astype(np.float32)

cov = [[ 0.36,  0.12],
       [ 0.12,  0.29]]

mvn = tfd.MultivariateNormalTriL(
    loc=mu,
    scale_tril=tf.linalg.cholesky(cov)
)

mvn

<tfp.distributions.MultivariateNormalTriL 'MultivariateNormalTriL' batch_shape=[3, 4] event_shape=[2] dtype=float32>

s = mvn.sample(100000)
s.shape

TensorShape([100000, 3, 4, 2])

plt.imshow(np.mean(s, axis=0).reshape(2,-1))

<matplotlib.image.AxesImage at 0x7f42493bed90>

../_images/660b28e8b08f0717a63f9e0400013347f0dd7528cb5140d870d10d1f784211ea.png

plt.imshow(np.std(s, axis=0).reshape(2,-1))

<matplotlib.image.AxesImage at 0x7f42492ca280>

../_images/1a4a403e00119ebea88a0f1ad031e9967870adeee6e27f9358447fa350f6032a.png

Now redistribute the dimensions, moving the last batch dimension into the event:

mi = tfd.Independent(mvn, reinterpreted_batch_ndims=1)
mi

<tfp.distributions.Independent 'IndependentMultivariateNormalTriL' batch_shape=[3] event_shape=[4, 2] dtype=float32>

Observe that the samples have the same shape as before and follow the same individual distributions.

s = mi.sample(100000)
s.shape

TensorShape([100000, 3, 4, 2])

plt.imshow(np.mean(s, axis=0).reshape(2,-1))

<matplotlib.image.AxesImage at 0x7f95845115e0>

../_images/94010422f5327007724ddae9c90f546cba054184e242fc081b5127d125982523.png

plt.imshow(np.std(s, axis=0).reshape(2,-1))

<matplotlib.image.AxesImage at 0x7f9584442340>

../_images/d1be5b46f118364d60ab072abf58c8f7bf867749899670861f0de90987c00e4b.png
